Continuous Mandarin speech recognition using hierarchical recurrent neural networks
Authors
Abstract
An ANN-based continuous Mandarin base-syllable recognition system is proposed. It adopts a hybrid approach that combines a hierarchical recurrent neural network (HRNN) with a Viterbi search. The HRNN serves as a front-end processor and is responsible for calculating discrimination scores for all 411 base-syllables. The Viterbi search then finds the base-syllable sequence with the highest score, which is taken as the recognized output. Experimental results showed that the proposed system outperforms the conventional HMM method in both recognition accuracy and computational complexity. The system can also be further modified to reduce the computational complexity while leaving the recognition accuracy almost undegraded.
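The hybrid idea in the abstract, a front end that emits per-frame scores followed by a Viterbi search for the highest-scoring sequence, can be illustrated with a minimal sketch. It is not the paper's search: the abstract does not describe the syllable models or transition structure, so the function names `viterbi_decode` and `collapse_repeats`, the flat `switch_penalty`, and the toy three-label scores (standing in for the 411 base-syllable scores) are all assumptions made for illustration.

```python
from typing import List, Sequence


def viterbi_decode(scores: Sequence[Sequence[float]],
                   switch_penalty: float = 1.0) -> List[int]:
    """Return the best label index for every frame.

    scores[t][s] is a hypothetical front-end score for label s at frame t
    (higher is better). Changing label between frames costs switch_penalty,
    an assumed stand-in for whatever transition model a real system uses.
    """
    if not scores:
        return []
    n_labels = len(scores[0])
    best = list(scores[0])            # best path score ending in each label
    backptr: List[List[int]] = []     # backptr[t-1][s] = best predecessor of s

    for frame in scores[1:]:
        new_best, pointers = [], []
        for s in range(n_labels):
            # Default: stay in the same label; switch only if strictly better.
            prev, prev_score = s, best[s]
            for r in range(n_labels):
                cand = best[r] - (0.0 if r == s else switch_penalty)
                if cand > prev_score:
                    prev, prev_score = r, cand
            new_best.append(prev_score + frame[s])
            pointers.append(prev)
        best = new_best
        backptr.append(pointers)

    # Trace back from the best final label.
    state = max(range(n_labels), key=lambda s: best[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def collapse_repeats(path: Sequence[int]) -> List[int]:
    """Merge consecutive identical frame labels into one label each."""
    out: List[int] = []
    for label in path:
        if not out or out[-1] != label:
            out.append(label)
    return out


# Toy scores: frame 1 slightly prefers label 1, but not by enough
# to pay the switching cost twice.
toy_scores = [
    [2.0, 0.0, 0.0],
    [1.5, 1.6, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0],
    [0.0, 0.0, 3.0],
]
frame_path = viterbi_decode(toy_scores, switch_penalty=1.0)
print(frame_path)                    # [0, 0, 0, 2, 2]
print(collapse_repeats(frame_path))  # [0, 2]
```

With `switch_penalty=0.0` the search degenerates to picking the best label per frame independently, which on the toy scores yields the spurious one-frame excursion to label 1; the penalty is what lets the sequence-level search override a locally higher score.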
Similar resources
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
A Recurrent Neural Network Based Finite State Machine for Fast Continuous Mandarin Speech Recognition
In this paper, a novel recurrent neural network (RNN) based finite state machine (FSM) front-end processor is proposed. It is integrated with the DP search into a whole for fast continuous Mandarin speech recognition. The FSM pre-classifies each speech frame into three stable states and a transient state. Different search spaces are then set in the following DP search for these four states in order...
Exponential Moving Average Model in Parallel Speech Recognition Training
As training data grows rapidly, large-scale parallel training on multi-GPU clusters is now widely applied in neural network model learning. We present a new approach that applies the exponential moving average method in large-scale parallel training of neural network models. It is a non-interference strategy in which the exponential moving average model is not broadcast to distributed worke...
An RNN-based preclassification method for fast continuous Mandarin speech recognition
A novel recurrent neural network-based (RNN-based) frontend preclassification scheme for fast continuous Mandarin speech recognition is proposed in this paper. First, an RNN is employed to discriminate each input frame for the three broad classes of initial, final, and silence. A finite state machine (FSM) is then used to classify the input frame into four states including three stable states o...
Modular recurrent neural networks for Mandarin syllable recognition
A new modular recurrent neural network (MRNN)-based speech-recognition method that can recognize the entire vocabulary of 1280 highly confusable Mandarin syllables is proposed in this paper. The basic idea is to first split the complicated task, in both feature and temporal domains, into several much simpler subtasks involving subsyllable and tone discrimination, and then to use two weighting ...